G: Bandits, Experts and Games (09/19/16), Lecture 3: Lower Bounds for Bandit Algorithms
Abstract
Note that (2) implies (1): if the regret is high in expectation over problem instances, then there exists at least one problem instance with high regret. Conversely, (1) implies (2) when |F| is a constant. This can be seen as follows: suppose that any algorithm has high regret (say H) on one problem instance in F and low regret on all other instances in F. Then, taking the uniform distribution over F, any algorithm has expected regret at least H/|F|. (This argument breaks down if |F| is large.) If we prove a stronger version of (1), stating that for any algorithm the regret is high on a constant fraction of the problem instances in F, then, again considering the uniform distribution over F, we obtain (2) regardless of whether |F| is large. In this lecture, to prove lower bounds, we consider 0-1 rewards and the following family of problem instances (with ε fixed, to be adjusted in the analysis):
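The abstract breaks off here. For concreteness, a standard family that matches this setup (0-1 rewards, K arms, a margin parameter ε) is the "needle in haystack" construction; it is the usual choice in such lower-bound proofs and is given here as an assumption, not as the notes' verbatim definition:

\[
\mathcal{I}_j:\qquad \mu_i \;=\; \begin{cases} (1+\varepsilon)/2, & i = j,\\[2pt] 1/2, & i \neq j, \end{cases} \qquad j \in \{1, \dots, K\},
\]

where \(\mu_i\) denotes the mean reward of arm \(i\), so each instance \(\mathcal{I}_j\) plants a single best arm \(j\) that beats every other arm by a margin of \(\varepsilon/2\).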
Similar Resources
G: Bandits, Experts and Games (09/12/16), Lecture 4: Lower Bounds (ending); Thompson Sampling
Here ε is a parameter to be adjusted in the analysis. Recall that K is the number of arms. We considered a "bandits with predictions" problem, and proved that it is impossible to make an accurate prediction with high probability if the time horizon is too small, regardless of what bandit algorithm we use to explore and make the prediction. In fact, we proved it for at least a third of problem instances…
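The "at least a third of problem instances" statement is precisely the constant-fraction strengthening of (1) discussed in the abstract above. Written out, if an algorithm A has regret R(A, I) ≥ H on a subset G ⊆ F with |G| ≥ |F|/3, and regret is nonnegative on the remaining instances (an assumption made for this sketch; the notation R(A, I) is introduced here for illustration), then under the uniform distribution over F:

\[
\mathbb{E}_{I \sim \mathrm{Unif}(F)}\big[R(A, I)\big] \;=\; \frac{1}{|F|} \sum_{I \in F} R(A, I) \;\ge\; \frac{|G|}{|F|}\, H \;\ge\; \frac{H}{3},
\]

which lower-bounds the expected regret independently of |F|.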
G: Bandits, Experts and Games (10/10/16), Lecture 6: Lipschitz Bandits
Motivation: similarity between arms. In various bandit problems, we may have information on similarity between arms, in the sense that "similar" arms have similar expected rewards. For example, arms can correspond to "items" (e.g., documents) with feature vectors, and similarity can be expressed as some notion of distance between feature vectors. Another example is the dynamic pricing problem…
Bandit-Based Estimation of Distribution Algorithms for Noisy Optimization: Rigorous Runtime Analysis
We show complexity bounds for noisy optimization, in frameworks in which the noise is stronger than in previously published papers [19]. We also propose an algorithm based on bandits (variants of [16]) that reaches the bound within logarithmic factors. We emphasize the differences with empirically derived published algorithms.
Contextual Bandits with Stochastic Experts
We consider the problem of contextual bandits with stochastic experts, which is a variation of the traditional stochastic contextual bandit with experts problem. In our problem setting, we assume access to a class of stochastic experts, where each expert is a conditional distribution over the arms given a context. We propose upper-confidence bound (UCB) algorithms for this problem, which employ...
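The snippet is cut off before the algorithms are specified. For orientation only, the textbook UCB index that such algorithms build on scores each stochastic expert e at round t as an empirical mean plus a confidence width; this generic form is an assumption for illustration, not necessarily the estimator the paper employs:

\[
\mathrm{UCB}_t(e) \;=\; \hat{\mu}_t(e) + \sqrt{\frac{2 \ln t}{n_t(e)}},
\]

where \(\hat{\mu}_t(e)\) is the average observed reward of expert \(e\) and \(n_t(e)\) is the number of rounds in which \(e\) has been selected so far.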
Stochastic and Adversarial Combinatorial Bandits
This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the stochastic setting, we first derive problem-specific regret lower bounds, and analyze how these bounds scale with the dimension of the decision space. We then propose COMBUCB, algorithms that efficiently exploit the combinatorial structure of the problem, and derive finite-time upper bounds on their…